DETAILED ACTION

Claims 3-12, 15, 19-26 are pending in this Application. 
This action is responsive to the Amendment filed January 8, 2008. 
Prosecution has been reopened. All previously presented rejections of the claims
are hereby withdrawn as moot. See the new Office action below.

Examiner's note: Examiner formally withdraws the objection to the specification
and the rejection under 35 USC 101 for claims 19-24.

Based on Applicant's remarks and newly found prior art, a non-final Office action
is presented.

Examiner's remarks regarding 35 USC §101: 

Regarding Claims 19-26, Examiner examined Applicant's claimed "computer
program product on computer readable storage medium".
In the specification, page 6 (paragraph [0017]),
Applicant states:

"The processor 110 executes a set of computer-executable program
instructions stored in memory 108. Such processors may include a microprocessor, an
ASIC, and state machines. Such processors include, or may be in communication with,
media, for example computer-readable media, which stores instructions that, when
executed by the processor, cause the processor to perform the steps described herein."
Examiner believes that a 35 USC 101 rejection is not warranted. 

Specification 

The disclosure is objected to because of the following informality: the prior art
reference cited in Applicant's instant application (page 13, lines 1-3), Dan Gusfield's
"Algorithms on Strings, Trees, and Sequences", has not been provided on an Information
Disclosure form. Examiner requests a copy of this prior art for consideration.

Appropriate correction is required. 



Claim Rejections - 35 USC 103 

The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action:

(a) A patent may not be obtained though the invention is not identically disclosed or 
described as set forth in section 102 of this title, if the differences between the 
subject matter sought to be patented and the prior art are such that the subject 
matter as a whole would have been obvious at the time the invention was made to a 
person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 



The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148
USPQ 459 (1966), that are applied for establishing a background for determining
obviousness under 35 U.S.C. 103(a) are summarized as follows:

1. Determining the scope and contents of the prior art.

2. Ascertaining the differences between the prior art and the claims at issue. 

3. Resolving the level of ordinary skill in the pertinent art. 

4. Considering objective evidence present in the application indicating 
obviousness or nonobviousness. 

This application currently names joint inventors. In considering patentability of
the claims under 35 U.S.C. 103(a), the examiner presumes that the subject matter of
the various claims was commonly owned at the time any inventions covered therein
were made absent any evidence to the contrary. Applicant is advised of the obligation
under 37 CFR 1.56 to point out the inventor and invention dates of each claim that was
not commonly owned at the time a later invention was made in order for the examiner to
consider the applicability of 35 U.S.C. 103(c) and potential 35 U.S.C. 102(e), (f) or (g)
prior art under 35 U.S.C. 103(a).

Claims 3-12, 15 and 19-22 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Brian L. Hazlehurst (U.S. Patent # 6,289,353 B1, "Hazlehurst"
hereinafter) in view of Ion Muslea et al. (U.S. Patent # 6,606,625 B1, "Muslea"
hereinafter).

Regarding Claims 12 and 23, Hazlehurst teaches: accessing a plurality of related
articles (i.e. corpus of documents) (Col 16, line 54); determining an article from the
related articles (i.e. incoming documents from multiple information sources (e.g., in-house
editorial staff, third-party news feeds, large databases, World Wide Web spiders) and
feed documents) (Col 2, lines 40-43); identifying at least one information field (i.e.
variable specifying threshold for maximum distance) (Col 21, lines 41-42) within the
article by comparing the article to at least one related article (i.e. spanned by a set of
concepts which are central to a significant portion of the set of documents) (Col 3, lines
1-2) or (i.e. a first document regarding cars and a second document relating to trucks in
which similarity between the two documents is determined) (Col 4, lines 63-65);
creating a template based on the identified information field (i.e. arrangement in the
recommended list by score) (Figure 16, item 233) and (i.e. convert documents to a
standard format) (Col 7, lines 44-45) (Examiner notes that a template is inherent as a file
format like documents); identifying a plurality of templates (i.e. formats) (Col 9, lines
10-12) comprising at least one information field (i.e. index 102 has document ID field and
grinder ID field) (Figure 7, item 102) (Examiner notes that a field is data that has several
parts, or rows and columns, typically found in a database, such as the rows represented
in a document table) (see Col 12, lines 19 and 22); comparing (i.e. find similarity between
documents) (Figure 21, items 268 and 270) the source article (i.e. corpus of documents)
(Col 16, line 54) to the templates (i.e. formats) (Col 9, lines 10-12) to determine the
closest template (i.e. convert documents to a canonical source-independent format for
use by the document indexing and storage system) (Col 7, lines 44-47) (see also
distance metrics, Col 12, lines 12-16); associating data from the source article (i.e.
corpus of documents) (Col 16, line 54) with an information field (i.e. index 102 has
document ID field and grinder ID field) (Figure 7, item 102) from the closest template
(i.e. closest document vectors) (Col 12, line 34) (see also sorting list of document
distances in increasing order) (Col 12, lines 25-26); and extracting data (i.e. extract the
address) (Col 3, lines 26-27) or (i.e. extracted by liaison from user tank for user; these
are the symbolic profile data which have been asserted by user about himself or
herself) (Col 24, lines 37-40).

Hazlehurst does not expressly teach seed.

Muslea teaches seed (i.e. creating rule candidates based upon seed) (Col 20, 
line 43). 

It would have been obvious to a person of ordinary skill in the art at the time of
Applicant's invention to modify the teachings of Hazlehurst with the teachings of Muslea
to include seed, with the motivation of allowing the user to gather information
from an identified and semi-structured source, providing a stepping stone to an ultimate
goal of harvesting information from unpredictable, but stable, information sources, and
giving the user control over the information he or she wants, so that the user can choose
almost any kind or type of information for return from the vast information reservoir
(Muslea, Col 3, lines 24-44).

Regarding Claims 4 and 20, Hazlehurst teaches clustering of related articles
(Figure 1, items 34, 32, and 33).

Regarding Claims 6 and 22, Hazlehurst teaches identification of the information field
performed by comparing the article to a cluster of articles (i.e. vehicles in relation to car
and truck) (Figure 1, item 22) or (i.e. clusters of bone cancer and breast cancer are
compared and given a region of user interest) (Figure 2, items 37, 38 and 39).

Hazlehurst does not expressly teach seed.

Muslea teaches seed (i.e. creating rule candidates based upon seed) (Col 20, 
line 43). 

It would have been obvious to a person of ordinary skill in the art at the time of
Applicant's invention to modify the teachings of Hazlehurst with the teachings of Muslea
to include seed, with the motivation of allowing the user to gather information
from an identified and semi-structured source, providing a stepping stone to an ultimate
goal of harvesting information from unpredictable, but stable, information sources, and
giving the user control over the information he or she wants, so that the user can choose
almost any kind or type of information for return from the vast information reservoir
(Muslea, Col 3, lines 24-44).

Regarding Claim 7, Hazlehurst teaches variable data (i.e. variable specifying
threshold) (Col 21, lines 41-42).

Regarding Claim 8, Hazlehurst teaches web pages (i.e. web sites on the world
wide web such as Medline or MDX health digests) (Col 7, lines 28-32).

Regarding Claim 9, Hazlehurst teaches web pages on the web site (i.e. web sites
on the world wide web such as Medline or MDX health digests) (Col 7, lines 28-32).

Regarding Claim 10, Hazlehurst teaches content on the web page (i.e. Medline or
MDX health digests) (Col 7, lines 28-32).

Regarding Claim 11, Hazlehurst teaches preserving visible text, visible images,
paragraph and table formatting (i.e. reads on the web page) (Examiner notes that web
pages may be retrieved and are therefore preserved. Web pages usually contain links to
images and media. A web page, as an information set, can contain many kinds of
information that can be seen, heard, or interacted with by the end user).

Regarding Claims 24 and 25, Hazlehurst teaches displaying (i.e. displays to a
user) (Col 30, lines 17-18) and storing (i.e. storing in database) (Abstract).

It is noted that any citations to specific pages, columns, lines, or figures in the
prior art references and any interpretation of the references should not be considered
to be limiting in any way. A reference is relevant for all it contains and may be relied
upon for all that it would have reasonably suggested to one having ordinary skill in the
art. See MPEP 2123.

Allowable Subject Matter 

Claim 15 is allowed over the prior art made of record. 
Applicant's particular limitations directed at the extraction of data with
identifying templates and comparing the source article to the templates using a dynamic
programming alignment algorithm to compute the edit distance between the source article
and the templates containing one information field, in combination with the other
limitations of the claims, were not disclosed by, would not have been obvious over, and
would not have been fairly suggested by the prior art of record, in the context of the
claims and the specification.
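
For illustration only, and not as part of the claims or the record: the comparison
described above, in which a dynamic programming alignment computes the edit distance
between a source article and each template and the closest template is selected, could
be sketched as follows. The function names, the token-list representation of articles
and templates, and the unit edit costs are assumptions made for this sketch; they are
not taken from the application or from the prior art of record.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences, computed row by row
    with dynamic programming (unit cost for insert, delete, substitute)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def closest_template(source, templates):
    """Return (name, distance) for the template with the smallest
    edit distance to the source (hypothetical helper)."""
    scored = ((name, edit_distance(source, t)) for name, t in templates.items())
    return min(scored, key=lambda pair: pair[1])


# Hypothetical data: articles and templates reduced to lists of field tokens.
templates = {
    "news": ["title", "date", "byline", "body"],
    "product": ["name", "price"],
}
source = ["title", "date", "body"]
name, dist = closest_template(source, templates)
```

In this hypothetical example the "news" template is selected, because only one
insertion separates it from the source token list.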

Claims 3, 5, 19, 21 and 24 are objected to as being dependent upon a rejected
base claim, but would be allowable if rewritten in independent form including all of the
limitations of the base claim and any intervening claims.

Applicant's particular limitations directed at the extraction of data with
identifying templates and comparing the source article to the templates using a dynamic
programming alignment algorithm to determine a cluster of related seed articles from
related articles based on edit distances, and to compute edit distances between the source
article and the templates, in combination with the other limitations of the claims, were not
disclosed by, would not have been obvious over, and would not have been fairly suggested
by the prior art of record, in the context of the claims and the specification.

The dependent claims, being further limiting to the independent claims, definite,
and enabled by the Specification, are also allowed. The closest prior art fails to
anticipate or render obvious Applicant's limitations above.

Any comments considered necessary by applicant must be submitted no later 
than the payment of the issue fee and, to avoid processing delays, should preferably 
accompany the issue fee. Such submissions should be clearly labeled "Comments on 
Statement of Reasons for Allowance." 

Conclusion 

Any inquiry concerning this communication or earlier communications from 
If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Christian Chase, can be reached at (571) 272-4190. The fax phone
numbers for the organization where this application or proceeding is assigned are (703)
872-9306 for regular communications and (703) 305-3900 for After Final
communications.

Any inquiry of a general nature or relating to the status of this application or
proceeding should be directed to the receptionist whose telephone number is
(571) 272-2100.

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. Status 
information for unpublished applications is available through Private PAIR only. 

For more information about the PAIR system, see http://pair-direct.uspto.gov.
Should you have questions on access to the Private PAIR system, contact the
Electronic Business Center (EBC) at 866-217-9197 (toll free).

/Diane Mizrahi/ 

Diane.Mizrahi@USPTO.gov 
Primary Patent Examiner 
Technology Center 2100 

April 6, 2008